A Recurrent Neural Network Survival Model: Predicting Web User Return Time

Author: A Graves
AG Hawkes
B Efron
DR Cox
FE Harrell
H Ishwaran
JD Kalbfleisch
JP Klein
M Han
N Breslow
R Chandra
S Hochreiter
X Cai
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/07/2018

The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood to return to a site. Essential to this is predicting when a user will return. Current state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both techniques are severely limited when applied to this problem. Survival models can only incorporate aggregate representations of users instead of automatically learning a representation directly from a raw time series of user actions. RNNs can automatically learn features, but cannot be directly trained with examples of non-returning users who have no target value for their return time. We develop a novel RNN survival model that removes the limitations of the state-of-the-art methods. We demonstrate that this model can successfully be applied to return time prediction on a large e-commerce dataset with a superior ability to discriminate between returning and non-returning users than either method applied in isolation. Comment: Accepted into ECML PKDD 2018; 8 figures and 1 table.
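
The abstract's point about non-returning users can be illustrated with a much simpler model than the paper's. The sketch below is not the authors' RNN survival model: it assumes a constant return rate (an exponential survival model), and the function names and toy data are hypothetical. It shows only how a survival likelihood lets a user with no observed return time still contribute, through the probability of not having returned yet.

```python
import math

def censored_exponential_nll(rate, durations, returned):
    """Negative log-likelihood under an assumed constant return rate.

    durations[i] is the observed time for user i; returned[i] says whether
    that time ended with a return (True) or with the end of the observation
    window (False, i.e. a censored, non-returning user).
    """
    nll = 0.0
    for t, did_return in zip(durations, returned):
        if did_return:
            # Observed return: log density, log(rate) - rate * t.
            nll -= math.log(rate) - rate * t
        else:
            # Censored user: log survival probability, -rate * t.
            nll -= -rate * t
    return nll

def fit_rate(durations, returned):
    """Closed-form maximiser: number of returns / total observed time."""
    return sum(returned) / sum(durations)

# Hypothetical toy data: three users returned, two never did.
durations = [2.0, 5.0, 1.0, 8.0, 10.0]
returned = [True, True, True, False, False]

rate_hat = fit_rate(durations, returned)
# Dropping the censored users instead would overstate the return rate.
naive_rate = 3 / sum(d for d, r in zip(durations, returned) if r)
```

A regression trained only on observed return times would have to discard the last two users; here they lower the estimated rate, which is the information a survival formulation retains.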

arXiv.org e-Print Archive

Crossref

Reply to comment by S. Nadarajah on "Space-time modeling of soil moisture: Stochastic rainfall forcing with heterogeneous vegetation"

Author: Cox DR
Isham V
Manfreda S
Porporato A
Rodriguez-Iturbe I
Publication venue: AMER GEOPHYSICAL UNION
Publication date: 01/10/2007

UCL Discovery

Space-time modeling of soil moisture: Stochastic rainfall forcing with heterogeneous vegetation

Author: Cox DR
Isham V
Manfreda S
Porporato A
Rodriguez-Iturbe I
Publication venue: AMER GEOPHYSICAL UNION
Publication date: 25/03/2006

The present paper complements that of Isham et al. (2005), who introduced a space-time soil moisture model driven by stochastic space-time rainfall forcing with homogeneous vegetation and in the absence of topographical landscape effects. However, the spatial variability of vegetation may significantly modify the soil moisture dynamics with important implications for hydrological modeling. In the present paper, vegetation heterogeneity is incorporated through a two-dimensional Poisson process representing the coexistence of two functionally different types of plants (e.g., trees and grasses). The space-time statistical structure of relative soil moisture is characterized through its covariance function, which depends on soil, vegetation, and rainfall patterns. The statistical properties of the soil moisture process averaged in space and time are also investigated. These properties are especially important for any modeling that aggregates soil moisture characteristics over a range of spatial and temporal scales. It is found that, particularly at small scales, vegetation heterogeneity has a significant impact on the averaged process as compared with the uniform vegetation case. Also, averaging in space considerably smooths the soil moisture process, but in contrast, averaging in time up to 1 week leads to little change in the variance of the averaged process.

UCL Discovery

Stormwater in Silver Bow and Blacktail Creeks: Implications for the Microbial Community

Author: Cox Dr. Alysia
Foster Jordanr
Publication venue: Intermountain Journal of Science
Publication date: 31/12/2016

Silver Bow and Blacktail Creeks are the headwaters of the Clark Fork River and are impacted by historic mining activities in the area. Although metal concentrations of runoff into the creeks are monitored and reported in previous studies, the composition and diversity of microbial communities are unknown. We seek to identify the microbial communities present and investigate changes in community structure due to stormwater impact, thereby determining and monitoring the overall environmental health of the system. We sampled five sites in Silver Bow and Blacktail Creeks in Butte, MT for chemical and biological analyses during high stormwater flow events. Water samples were collected for analysis of major anions and cations, metal concentrations, dissolved inorganic and organic carbon and carbon isotopes, and hydrogen and oxygen isotopes in water. In situ measurements of pH, temperature and dissolved oxygen were taken at the time of sampling. Redox-sensitive species - total dissolved sulfide, dissolved silica and ferrous iron - were measured using wet chemical tests and field spectrophotometry. Concurrent biological samples were collected for microbial identification and diversity (DNA), activity (protein), quantity (cell counts) and culturing. Overall microbial results are in progress, but water chemistry data provide clues about microbial habitats available in the creeks. Results upstream in Butte will be compared to downstream areas such as Durant Canyon and the Warm Springs Settling Ponds. The relationship between water chemistry, microbes, and overall ecosystem health can be characterized by deciphering how water chemistry affects microbial activity and vice versa.

Montana State University Library Open Journal Systems

Detecting bias arising from delayed recording of time

Author: Cox DR
De Stavola Bianca L
Publication venue: 'Wiley'
Publication date: 16/12/2016

Sometimes in studies of the dependence of survival time on explanatory variables, the natural time origin for defining entry into study cannot be observed and a delayed time origin is used instead. For example, diagnosis of disease may in some patients be made only at death. The effect of such delays is investigated both theoretically and in the context of the England and Wales National Cancer Register.

LSHTM Research Online

Combining frequency and time domain approaches to systems with multiple spike train input and output

Author: D. R. Brillinger
DM Halliday
DR Brillinger
DR Cox
E Jankowska
GP Moore
IA Boyd
J. R. Rosenberg
JR Rosenberg
K. A. Lindsay
KA Lindsay
M Hilbe
MH Gladden
MP Mileusnic
P McCullagh
RW Banks
SA Edgley
V Vapnik
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

A frequency domain approach and a time domain approach have been combined in an investigation of the behaviour of the primary and secondary endings of an isolated muscle spindle in response to the activity of two static fusimotor axons when the parent muscle is held at a fixed length and when it is subjected to random length changes. The frequency domain analysis has an associated error process which provides a measure of how well the input processes can be used to predict the output processes and is also used to specify how the interactions between the recorded processes contribute to this error. Without assuming stationarity of the input, the time domain approach uses a sequence of probability models of increasing complexity in which the number of input processes to the model is progressively increased. This feature of the time domain approach was used to identify a preferred direction of interaction between the processes underlying the generation of the activity of the primary and secondary endings. In the presence of fusimotor activity and dynamic length changes imposed on the muscle, it was shown that the activity of the primary and secondary endings carried different information about the effects of the inputs imposed on the muscle spindle. The results presented in this work emphasise that the analysis of the behaviour of complex systems benefits from a combination of frequency and time domain methods.

Crossref

Springer - Publisher Connector

eScholarship - University of California

Enlighten

ElasticMatrix: A MATLAB toolbox for anisotropic elastic wave propagation in layered media

Author: Cox BT
Ramasawmy DR
Treeby BE
Publication date: 01/01/2020

Simulating the propagation of elastic waves in multi-layered media has many applications. A common approach is to use matrix methods where the elastic wave-field within each material layer is represented by a sum of partial-waves along with boundary conditions imposed at each interface. While these methods are well-known, coding the required matrix formation, inversion, and analysis for general multi-layered systems is non-trivial and time-consuming. Here, a new open-source toolbox called ElasticMatrix is described which solves the problem of acoustic and elastic wave propagation in multi-layered media for isotropic and transverse-isotropic materials where the wave propagation occurs in a material plane of symmetry. The toolbox is implemented in MATLAB using an object-oriented programming framework and is designed to be easy to use and extend. Methods are provided for calculating and plotting dispersion curves, displacement and stress fields, reflection and transmission coefficients, and slowness profiles.

UCL Discovery

Big data: Some statistical issues.

Author: Cox DR
Kartsonaki Christiana
Keogh Ruth H
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

A broad review is given of the impact of big data on various aspects of investigation. There is some but not total emphasis on issues in epidemiological research.

LSHTM Research Online

Oxford University Research Archive

Research data management and openness: the role of data sharing in developing institutional policies and practices

Author: Dr Andrew Cox
Rosie Higman
Stephen Pinfield
Publication venue: 'Emerald'
Publication date: 01/09/2015

Purpose: To investigate the relationship between research data management (RDM) and data sharing in the formulation of RDM policies and development of practices in higher education institutions (HEIs). Design/methodology/approach: Two strands of work were undertaken sequentially: firstly, content analysis of 37 RDM policies from UK HEIs; secondly, two detailed case studies of institutions with different approaches to RDM based on semi-structured interviews with staff involved in the development of RDM policy and services. The data are interpreted using insights from Actor Network Theory. Findings: RDM policy formation and service development have created a complex set of networks within and beyond institutions involving different professional groups with widely varying priorities shaping activities. Data sharing is considered an important activity in the policies and services of HEIs studied, but its prominence can in most cases be attributed to the positions adopted by large research funders. Research limitations/implications: The case studies, as research based on qualitative data, cannot be assumed to be universally applicable but do illustrate a variety of issues and challenges experienced more generally, particularly in the UK. Practical implications: The research may help to inform development of policy and practice in RDM in HEIs and funder organisations. Originality/value: This paper makes an early contribution to the RDM literature on the specific topic of the relationship between RDM policy and services, and openness – a topic which to date has received limited attention.

Central Archive at the University of Reading

Crossref

White Rose Research Online

A Quantile Variant of the EM Algorithm and Its Applications to Parameter Estimation with Interval Data

Author: Chanseok Park
Cox DR
Dempster AP
Elsayed EA.
Govindarajulu Z.
Leemis LM.
Press WH
Robert CP
Sun J.
Publication venue: 'SAGE Publications'
Publication date: 11/05/2018

The expectation-maximization (EM) algorithm is a powerful computational technique for finding the maximum likelihood estimates for parametric models when the data are not fully observed. The EM is best suited for situations where the expectation in each E-step and the maximization in each M-step are straightforward. A difficulty with the implementation of the EM algorithm is that each E-step requires the integration of the log-likelihood function in closed form. The explicit integration can be avoided by using what is known as the Monte Carlo EM (MCEM) algorithm. The MCEM uses a random sample to estimate the integral at each E-step. However, the problem with the MCEM is that it often converges to the integral quite slowly and the convergence behavior can also be unstable, which causes a computational burden. In this paper, we propose what we refer to as the quantile variant of the EM (QEM) algorithm. We prove that the proposed QEM method has an accuracy of $O(1/K^2)$ while the MCEM method has an accuracy of $O_p(1/\sqrt{K})$. Thus, the proposed QEM method possesses faster and more stable convergence properties when compared with the MCEM algorithm. The improved performance is illustrated through the numerical studies. Several practical examples illustrating its use in interval-censored data problems are also provided.
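
The contrast between the two accuracy rates can be seen on a single E-step-style integral. The sketch below is not the paper's implementation. It assumes an exponential lifetime observed only as lying in an interval (a, b), approximates its conditional mean once with K evenly spaced quantiles and once with K random draws, and compares both with the closed form. The quantile positions (k - 1/2)/K, the parameter values, and all function names are assumptions made for illustration.

```python
import math
import random

# Assumed setting: lifetime X ~ Exponential(RATE), known only to lie in (A, B).
RATE, A, B = 1.0, 1.0, 3.0

def truncated_exp_quantile(p, rate=RATE, a=A, b=B):
    """Inverse CDF of X conditional on a < X < b, for p in [0, 1)."""
    upper = math.exp(-rate * a)
    lower = math.exp(-rate * b)
    return -math.log(upper - p * (upper - lower)) / rate

def exact_mean(rate=RATE, a=A, b=B):
    """Closed-form E[X | a < X < b], used here only as a reference value."""
    ea, eb = math.exp(-rate * a), math.exp(-rate * b)
    return 1.0 / rate + (a * ea - b * eb) / (ea - eb)

def quantile_estimate(k):
    """Average over K deterministic quantiles at positions (i - 1/2) / K."""
    return sum(truncated_exp_quantile((i - 0.5) / k) for i in range(1, k + 1)) / k

def monte_carlo_estimate(k, rng):
    """Average over K random draws, obtained by inverse-CDF sampling."""
    return sum(truncated_exp_quantile(rng.random()) for _ in range(k)) / k

rng = random.Random(0)
for k in (100, 200, 400):
    q_err = abs(quantile_estimate(k) - exact_mean())
    mc_err = abs(monte_carlo_estimate(k, rng) - exact_mean())
    print(f"K={k:4d}  quantile error={q_err:.2e}  Monte Carlo error={mc_err:.2e}")
```

Because the quantile positions are fixed, the quantile estimate is deterministic and its error shrinks roughly fourfold each time K doubles, while the Monte Carlo error depends on the random seed and shrinks far more slowly on average.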

arXiv.org e-Print Archive

Crossref